INTRODUCTION: Phylogenetic Community Structure of Anolis Genus

Author: Nandini Lohia

Purpose: Comprehensive analysis of Anolis species distribution and phylogenetic relationships

Research Questions:
1. What is the geographical distribution of Anolis species?
2. What are the evolutionary relationships among Anolis species?
3. How are Anolis species distributed across different communities?

Sources for the project methodology include:
- Phylogenetic comparative methods in ecology and evolution (Revell, 2014)
- Community phylogenetics: concepts and approaches (Webb et al., 2002)
- Bioinformatics sequence analysis (Durbin et al., 1998)
- Phylogenetic Community Structure Analysis in R

GitHub Repository: https://github.com/nlohia/Phylogenetic-Community-structure-of-the-Genus-Anolis-

1. Introduction

The Anolis genus is a well-studied model system for evolutionary biology, ecological diversity, and community structure across geographical regions. Anolis lizards are widely distributed throughout the Americas, which makes them well suited to exploring patterns of species distribution, genetic diversity, and phylogenetic relationships. This study examines the phylogenetic community structure of Anolis species by integrating occurrence data, genetic sequences, and phylogenetic analysis. The research objectives of this project are:

- Investigating the geographical distribution of Anolis species
- Constructing a robust phylogenetic tree to understand evolutionary relationships
- Analyzing community structure across different geographical regions
- Examining phylogenetic diversity and its implications for species distribution

By employing comprehensive bioinformatics approaches, we seek to address critical questions in evolutionary biology and community ecology. Specifically, our research will explore how historical evolutionary processes and geographical constraints shape the distribution and diversity of Anolis lizards.

2. Description of Data Set

The dataset for this project comprises two primary sources of biological information:

- Occurrence Data: Retrieved from the Global Biodiversity Information Facility (GBIF). Each record includes geographical coordinates, species name, country, and state/province. The data are downloaded each time the script is run (up to 500 records per query) and restricted to Anolis records with valid geographical coordinates.
- Genetic Sequences: 16S rRNA gene sequences for Anolis species were obtained from GenBank (NCBI). The sequences were then filtered for ambiguous bases, excessive gaps, duplicates, and short length, as described in Code Section 1.

Key properties of the dataset include:

- Multiple Anolis species across various geographical regions
- Comprehensive geographical coverage
- Genetic sequences representing different species
- Clean, filtered data with quality control measures applied

3. Code Section 1: Data Acquisition, Exploration, Filtering, and Quality Control

Data Acquisition Strategy

- Used the rgbif package for occurrence data retrieval and rentrez for genetic sequence acquisition
- Applied strict filtering criteria to ensure data quality

Quality Control Measures

For Occurrence Data:
- Removed entries with missing latitude/longitude
- Kept distinct combinations of species and coordinates
- Filtered for complete geographical information

For Genetic Sequence Data:
- Removed sequences containing 'N' (ambiguous or missing bases)
- Filtered out sequences with excessive gaps (50% or more of positions)
- Excluded duplicate and short sequences (<100 bases)
- Standardized species name formatting

Visualization of Data Quality

Sequence Quality Control in Detail

Sequence filtering arguments:
- max_gap_percentage = 0.5: Keeps sequences in which fewer than 50% of positions are gaps, retaining more sequences while still excluding mostly empty ones
- min_length = 100: Removes very short sequences that might represent incomplete or low-quality genetic data
- Rationale: Balances data retention against sequence reliability
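The four sequence filters listed above can be illustrated on plain character strings. This is only a minimal base-R sketch, not the code used in the analysis (which works on a Biostrings DNAStringSet): the helper name `filter_sequences` and the toy sequences are hypothetical, and the length rule is assumed to keep sequences of at least `min_length` bases.

```r
# Hypothetical helper: applies the QC rules to a named character vector of sequences.
filter_sequences <- function(seqs, max_gap_percentage = 0.5, min_length = 100) {
  # Guard against empty strings so the gap fraction below is always defined
  seqs <- seqs[nchar(seqs) > 0]
  # Rule 1: drop sequences containing ambiguous 'N' bases
  seqs <- seqs[!grepl("N", seqs, fixed = TRUE)]
  # Rule 2: keep sequences whose gap fraction is below the threshold
  gap_counts <- nchar(seqs) - nchar(gsub("-", "", seqs, fixed = TRUE))
  seqs <- seqs[gap_counts / nchar(seqs) < max_gap_percentage]
  # Rule 3: drop exact duplicate sequences (the first copy is kept)
  seqs <- seqs[!duplicated(seqs)]
  # Rule 4: drop sequences shorter than min_length
  seqs[nchar(seqs) >= min_length]
}

# Toy input: one clean sequence and one example violating each rule
toy <- c(
  ok     = strrep("ACGT", 30),                          # 120 bases, clean
  with_n = paste0(strrep("ACGT", 30), "N"),             # contains 'N'
  gappy  = paste0(strrep("-", 80), strrep("A", 40)),    # 80 of 120 positions are gaps
  dup    = strrep("ACGT", 30),                          # duplicate of 'ok'
  short  = "ACGTACGT"                                   # only 8 bases
)

kept <- filter_sequences(toy)
print(names(kept))
```

Only the sequence named `ok` passes all four rules in this toy input. Adjusting `max_gap_percentage` or `min_length` shows how the two thresholds trade data retention against sequence reliability.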

## 1. PACKAGE INSTALLATION AND DEPENDENCY MANAGEMENT----
# Install and load necessary packages for data manipulation, biological analysis, phylogenetic analysis, spatial analysis, and visualization
if (!requireNamespace("pacman", quietly = TRUE)) install.packages("pacman")
pacman::p_load(
  # Data manipulation and visualization
  tidyverse, dplyr, tidyr, 
  
  # Biological sequence analysis
  Biostrings, DECIPHER, seqinr, rentrez, msa, muscle,
  
  # Phylogenetic analysis
  ape, phangorn, treeio, picante,
  
  # Spatial and ecological analysis
  rgbif, sf, vegan, 
  
  # Statistical and computational tools
  cluster, factoextra, future, furrr, parallel,
  
  # Advanced visualization
  plotly, RColorBrewer, ggtree, tidytree, ggplot2
)

## 2. SET RANDOM SEED FOR REPRODUCIBILITY----
# Consecutive set.seed() calls override one another, so only the last seed takes effect.
# Set a single seed here; re-seed immediately before a stochastic step if a stage-specific seed is needed.
set.seed(42)

## 3. OCCURRENCE DATA RETRIEVAL----
# Research Question: What is the geographical distribution of Anolis species?
# Retrieve occurrence data from Global Biodiversity Information Facility (GBIF)
# Filtering and cleaning the data for relevant columns: species, latitude, longitude, country, state, and IUCN status
occ_search_results <- occ_search(scientificName = "Anolis", hasCoordinate = TRUE, limit = 500)
occurrence_data <- occ_search_results$data %>%
  filter(!is.na(decimalLatitude), !is.na(decimalLongitude), !is.na(scientificName)) %>%
  distinct(species, decimalLatitude, decimalLongitude, .keep_all = TRUE) %>%
  dplyr::select(species, decimalLatitude, decimalLongitude, country, stateProvince, iucnRedListCategory)

# View the cleaned occurrence data
head(occurrence_data)
## # A tibble: 6 × 6
##   species             decimalLatitude decimalLongitude country     stateProvince
##   <chr>                         <dbl>            <dbl> <chr>       <chr>        
## 1 Anolis carolinensis            28.1            -82.6 United Sta… Florida      
## 2 Anolis carolinensis            32.6            -80.1 United Sta… South Caroli…
## 3 Anolis sagrei                  26.1            -80.1 United Sta… Florida      
## 4 Anolis biporcatus              16.8            -88.4 Belize      Stann Creek  
## 5 Anolis carolinensis            25.7            -80.3 United Sta… Florida      
## 6 Anolis sagrei                  27.8            -82.6 United Sta… Florida      
## # ℹ 1 more variable: iucnRedListCategory <chr>
## 4. SEQUENCE DATA RETRIEVAL AND PROCESSING----
# Research Questions: Quality control of genetic sequences from GenBank
# Example commented-out code to search for specific gene sequences (e.g., 16S rRNA) for Anolis species
#Anolis_16S_search <- entrez_search(db = "nuccore", term = "Anolis AND 16S", retmax = 300)
#Anolis_16S_search1 <- entrez_search(db = "nuccore", term = "Anolis AND 16S AND 400:650[SLEN]", retmax = 300)

# Fetch sequences (this part is commented out as it's hypothetical)
#Anolis_16S_sequences <- entrez_fetch(db = "nuccore", id = Anolis_16S_search1$ids, rettype = "fasta", retmode = "text")

# Load the sequences into R
Anolis_16S_summ <- readDNAStringSet("./data/Anolis_16S_sequences.fasta")

#### Quality Control and Filtering of Sequences----
# Remove sequences containing 'N' (ambiguous bases), sequences with excessive gaps (>=50%), and duplicates
Anolis_16S_summ_cleaned <- Anolis_16S_summ[!grepl("N", Anolis_16S_summ)]  # Drop sequences with 'N'
max_gap_percentage <- 0.5  # Keep sequences in which fewer than 50% of positions are gaps
Anolis_16S_summ_cleaned <- Anolis_16S_summ_cleaned[sapply(Anolis_16S_summ_cleaned, function(seq) {
  # Fraction of gap characters ("-") in this sequence
  sum(strsplit(as.character(seq), NULL)[[1]] == "-") / length(seq) < max_gap_percentage
})]

# Remove duplicate sequences and sequences shorter than 100 bases
Anolis_16S_summ_cleaned <- unique(Anolis_16S_summ_cleaned)
min_length <- 100
Anolis_16S_summ_cleaned <- Anolis_16S_summ_cleaned[nchar(Anolis_16S_summ_cleaned) >= min_length]  # Keep sequences of at least min_length bases

# Create a dataframe for cleaned sequences
dfAnolis_16S <- data.frame(Anolis16S_Title = names(Anolis_16S_summ_cleaned), 
                           Anolis_16S_Sequence = paste(Anolis_16S_summ_cleaned))
dfAnolis_16S$Species_Name <- word(dfAnolis_16S$Anolis16S_Title, 2L, 3L)

# Clean species names and filter valid species
dfAnolis_16S <- dfAnolis_16S %>%
  mutate(Species_Name = gsub("^A\\.\\s*", "Anolis ", Species_Name)) %>%
  mutate(Species_Name = word(Species_Name, 1, 2)) %>%
  filter(grepl("^Anolis\\s\\w+", Species_Name)) %>%
  dplyr::select(Anolis16S_Title, Species_Name, Anolis_16S_Sequence)  # Namespace-qualified to avoid masking by other loaded packages

# View the cleaned sequence data
head(dfAnolis_16S)
##                                                                                        Anolis16S_Title
## 1     MH140619.1 Anolis gaigei voucher CH 5426 16S ribosomal RNA gene, partial sequence; mitochondrial
## 2 MH140625.1 Anolis vittigerus voucher CH 6878 16S ribosomal RNA gene, partial sequence; mitochondrial
## 3 MH140624.1 Anolis vittigerus voucher CH 8821 16S ribosomal RNA gene, partial sequence; mitochondrial
## 4 MH140623.1 Anolis vittigerus voucher CH 4954 16S ribosomal RNA gene, partial sequence; mitochondrial
## 5 MH140622.1 Anolis vittigerus voucher CH 5037 16S ribosomal RNA gene, partial sequence; mitochondrial
## 6 MH140621.1 Anolis vittigerus voucher CH 5160 16S ribosomal RNA gene, partial sequence; mitochondrial
##        Species_Name
## 1     Anolis gaigei
## 2 Anolis vittigerus
## 3 Anolis vittigerus
## 4 Anolis vittigerus
## 5 Anolis vittigerus
## 6 Anolis vittigerus
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Anolis_16S_Sequence
## 1 AGCCTTTAGCAAAACAAGTATTAAAGGTAACGCCTGCCCAGTGAAATTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATGAAGACCTGTATGAATGGCTATATGAGTATTTAACTGTCTCCTTTAACTAATCAGTGAAACTGATCTTTCAGTACAAAAGCTGAAATATCATCATAAGACGAGAAGACCCTGTGGAGCTTTAAATTTTTAACAAAGTATCACTAAACAGACGCTTATGATAAAAAATCTTTAGTTGGGGCGACTTTGGAGCAAAACTTAACCTCCAAGATAAAAGTACCACCTAATTTCAGGCTCACAAGCCGAACCTTATAGACCCAGTATTAAATACTGATCAACGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCAAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGAAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAACAGTCCT
## 2  AGCCTTTAGCAAAACAAGTATTAAAGGTGACGCCTGCCCAGTGAAATTTTAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATAAAGACCCGTATGAATGGCTAAATGAATATTTAACTGTCTCCTTTAACTAATTAGTGAAACTGATCTTTCAGTACAAAAGCTGAAATATTAATATAAGACGAGAAGACCCCGTGGAGCTTTAAATTTTTAACAAGGTGTTACAAAAAGGTACCTATGATAAAAAATTTTTAGTTGGGGCGACTTTGGAGTACAACTAAACCTCCAAGAAAAGGCATTGCCTAAACCTTAGGCTTACAAGCCAAACCACATAGACCCAGTATTAAATACTGATCAACGAACCAAGTTACCCCAGGGATAACAGCGCTATCTTCTTCAAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGCAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAACAGTCCT
## 3  AGCCTTTAGCAAAACAAGTATTAAAGGTGACGCCTGCCCAGTGAAATTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATAAAGACCCGTATGAATGGCTAAATGAGTATTTAACTGTCTCCTTTAACTAATTAGTGAAACTGATCTTTCAGTACAAAAGCTGAAATATTAACATAAGACGAGAAGACCCCGTGGAGCTTTAAATTTTTAACAAAGTATTACAAAAAAGTACCTATGATAAAAAATTTTTAGTTGGGGCGACTTTGGAGTACAACTAAACCTCCAAGAAAAGGCATTGCCTAAACCTAAGGCTTACAAGCCAAACCATATAGACCCAGTATCAAATACTGATCAACGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCGAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGCAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAACAGTCCT
## 4  AGCCTTTAGCAAAACAAGTATTAAAGGTGACGCCTGCCCAGTGAAATTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATAAAGACCCGTATGAATGGCTAAATGAATATTTAACTGTCTCCTTTAACTAATTAGTGAAACTGATCTTTCAGTACAAAAGCTGAAATATTAACATAAGACGAGAAGACCCCGTGGAGCTTTAAATTTTTAACAAAGTATTACAAAAAAGTACCTATGATAAAAAATTTTTAGTTGGGGCGACTTTGGAGTACAACTAAACCTCCAAGAAAAGGCATTGCCTAAACCTCAGGCTTACAAGCCAAACCATATAGACCCAGTATCAAATACTGATCAACGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCGAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGCAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAACAGTCCT
## 5   AGCCTTTAGCAAAACAAGTATTAAAGGTGACGCCTGCCCAGTGAAATTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATAAAGACCCGTATGAATGGCTAAATGAATATTTAACTGTCTCCTTTAACTAATTAGTGAAACTGATCTTTCAGTACAAAAGCTGAAATATTAACATAAGACGAGAAGACCCCGTGGAGCTTTAAATTTTTAACAAAGTATTACAAAAAGTACCTATGATAAAAAATTTTTAGTTGGGGCGACTTTGGAGTACAACTAAACCTCCAAGAAAAGGCATTGCCTAAACCTCAGGCTTACAAGCCAAACCATATAGACCCAGTATCAAATACTGATCAACGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCGAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGCAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAACAGTCCT
## 6   AGCCTTTAGCAAAACAAGTATTAAAGGTGACGCCTGCCCAGTGAAATTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATAAAGACCCGTATGAATGGCTAAATGAATATTTAACTGTCTCCTTTAACTAATTAGTGAAACTGATCTTTCAGTACAAAAGCTGAAATATTAACATAAGACGAGAAGACCCCGTGGAGCTTTAAATTTTTAACAAAGTATTACAAAAAGTACCTATGATAAAAAATTTTTAGTTGGGGCGACTTTGGAGTACAACTAAACCTCCAAGAAAAGACATTGCCTAAACCTCAGGCTTACAAGCCAAACCATATAGACCCAGTATCAAATACTGATCAACGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCGAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGCAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAACAGTCCT
## 5. PHYLOGENETIC TREE CONSTRUCTION----
# Research Questions: Evolutionary relationships of Anolis species
# Create the DNAStringSet object for the cleaned sequence data
dna_sequences <- DNAStringSet(dfAnolis_16S$Anolis_16S_Sequence)
names(dna_sequences) <- dfAnolis_16S$Species_Name

# Keep one sequence per species (the first occurrence of each species name)
dfAnolis_16S_unique <- dfAnolis_16S[!duplicated(dfAnolis_16S$Species_Name), ]
dna_sequences_unique <- DNAStringSet(dfAnolis_16S_unique$Anolis_16S_Sequence)
names(dna_sequences_unique) <- dfAnolis_16S_unique$Species_Name
4. Main Software Tools Description

In this assignment, RStudio was the only tool used for data analysis, visualization, and bioinformatics tasks. The following R packages were employed:

- tidyverse, dplyr, and tidyr for data manipulation and visualization
- Biostrings, DECIPHER, seqinr, rentrez, msa, and muscle for biological sequence analysis
- ape, phangorn, treeio, and picante for phylogenetic analysis
- rgbif, sf, and vegan for spatial and ecological analysis
- cluster, factoextra, future, furrr, and parallel for statistical and computational tasks
- plotly, RColorBrewer, ggtree, tidytree, and ggplot2 for advanced data visualization

RStudio provided a single environment in which these packages could be combined for the analysis required in this assignment.

5. Code Section 2: Main Analysis

The code next reconstructs the phylogenetic relationships among the species of the genus Anolis. Two tree construction methods were used: Neighbor-Joining (NJ) and Maximum Likelihood (ML). The NJ tree is computationally efficient and gives a quick overview of species relationships from a distance matrix computed under the Jukes-Cantor model, which assumes equal base frequencies and equal substitution rates among all nucleotides. The ML tree uses the General Time Reversible (GTR) model with optimized parameters and therefore models the substitution process in more detail. Comparing the two topologies serves as a cross-check: relationships recovered by both methods are less likely to be artifacts of a single method. Both trees recover the same pair of sister species, Anolis cryptolimifrons and Anolis apletophallus.
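One way to check whether two trees agree on their sister-species pairs is to list the "cherries" (pairs of tips that share an immediate parent node) in each tree and compare the two lists. The sketch below is an illustration under stated assumptions, not the method used in this report: it assumes rooted, fully bifurcating trees, the helper names `find_cherries` and `cherry_keys` are hypothetical, and the two Newick strings are toy trees rather than the Anolis results.

```r
library(ape)

# Hypothetical helper: returns the tip-label pairs that form cherries in a rooted binary tree.
find_cherries <- function(tree) {
  n_tip <- length(tree$tip.label)
  # Group child node numbers by their parent node
  children <- split(tree$edge[, 2], tree$edge[, 1])
  # A cherry is an internal node whose two children are both tips (tips are numbered 1..n_tip)
  is_cherry <- vapply(children, function(ch) length(ch) == 2 && all(ch <= n_tip), logical(1))
  lapply(children[is_cherry], function(ch) sort(tree$tip.label[ch]))
}

# Hypothetical helper: one sorted text key per cherry, so two trees can be compared directly.
cherry_keys <- function(tree) {
  unname(sort(vapply(find_cherries(tree), paste, character(1), collapse = "|")))
}

# Toy trees with different overall shapes but the same sister pairs
tree_a <- read.tree(text = "((A,B),(C,(D,E)));")
tree_b <- read.tree(text = "(((D,E),C),(B,A));")

print(cherry_keys(tree_a))
print(identical(cherry_keys(tree_a), cherry_keys(tree_b)))
```

NJ trees returned by ape or phangorn are unrooted, so in practice both trees would need to be rooted in the same way (for example on a common outgroup) before a comparison like this is meaningful.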

Phylogenetic Tree Construction

Distance matrix (dist.ml()):
- model = "JC69": Jukes-Cantor model, chosen as a simple, neutral substitution model
- Justification: Assumes equal base frequencies and equal substitution rates between all pairs of nucleotides, which is suitable for initial phylogenetic exploration
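For intuition, the Jukes-Cantor distance between two aligned sequences can be computed by hand: if p is the proportion of sites that differ, the corrected distance is d = -(3/4) ln(1 - (4/3) p). The base-R sketch below is a simplified illustration; the function name `jc69_distance` is hypothetical, and dropping every column that contains a gap is an assumption that may differ from how dist.ml() treats gaps.

```r
# Hypothetical helper: Jukes-Cantor (JC69) distance between two aligned sequences.
jc69_distance <- function(seq_a, seq_b) {
  a <- strsplit(seq_a, "")[[1]]
  b <- strsplit(seq_b, "")[[1]]
  stopifnot(length(a) == length(b))    # sequences must already be aligned
  keep <- a != "-" & b != "-"          # assumption: ignore columns containing a gap
  p <- mean(a[keep] != b[keep])        # observed proportion of differing sites
  # JC69 correction for multiple substitutions at the same site (undefined for p >= 0.75)
  -0.75 * log(1 - (4 / 3) * p)
}

# Two toy sequences differing at 1 of 10 sites, so p = 0.1
d <- jc69_distance("ACGTACGTAC", "ACGTACGTAA")
print(d)  # about 0.107, slightly larger than p
```

The corrected distance is always at least as large as the raw proportion p, because the model accounts for substitutions that have been overwritten by later changes at the same site.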

Maximum Likelihood optimization:
- model = "GTR": General Time Reversible model, selected to capture more complex substitution patterns
- optInv = TRUE: Estimates the proportion of invariant sites
- optGamma = TRUE: Estimates the gamma shape parameter, which accommodates rate variation across sites
- Rationale: Provides a more detailed representation of evolutionary processes than JC69
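The settings above fit together roughly as in the following sketch, which builds an NJ starting tree from JC69 distances and then optimizes a GTR model with invariant sites and gamma rate variation using phangorn. It is a hedged illustration rather than the exact code of this report: the `Laurasiatherian` example alignment shipped with phangorn stands in for the Anolis alignment, and the 10-taxon subset, the starting values (`k = 4`, `inv = 0.1`), and the `"NNI"` rearrangement are assumptions chosen to keep the example quick.

```r
library(ape)
library(phangorn)

# Example alignment bundled with phangorn, used here as a stand-in for the Anolis data
data(Laurasiatherian)
aln <- subset(Laurasiatherian, subset = 1:10)   # first 10 taxa, to keep the run short

# Distance-based starting tree: JC69 distances followed by Neighbor-Joining
dm <- dist.ml(aln, model = "JC69")
nj_tree <- NJ(dm)

# Initial likelihood object: 4 gamma rate categories and a starting invariant-site proportion
fit_start <- pml(nj_tree, data = aln, k = 4, inv = 0.1)

# Optimize the GTR parameters, invariant sites, gamma shape, and topology (NNI moves)
fit_gtr <- optim.pml(
  fit_start,
  model = "GTR",
  optInv = TRUE,
  optGamma = TRUE,
  rearrangement = "NNI",
  control = pml.control(trace = 0)   # suppress progress output
)

print(logLik(fit_start))
print(logLik(fit_gtr))
ml_tree <- fit_gtr$tree   # optimized ML tree, comparable with nj_tree
```

The log-likelihood after optimization should not be lower than that of the starting fit, and `ml_tree` can be compared with `nj_tree` to see where the two methods agree.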

# Perform multiple sequence alignment using the MUSCLE algorithm
aligned_sequences_unique <- muscle::muscle(dna_sequences_unique)
## 
## MUSCLE v3.8.31 by Robert C. Edgar
## 
## http://www.drive5.com/muscle
## This software is donated to the public domain.
## Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.
## 
## file96942d870fe 73 seqs, max length 560, avg  length 486
## 2775 MB(89%)00:00:00                Iter   2  100.00%  Root alignment
## 2775 MB(89%)00:00:00                Iter   3    1.40%  Refine biparts2775 MB(89%)00:00:00                Iter   3    2.10%  Refine biparts2775 MB(89%)00:00:00                Iter   3    2.80%  Refine biparts2775 MB(89%)00:00:00                Iter   3    3.50%  Refine biparts2775 MB(89%)00:00:00                Iter   3    4.20%  Refine biparts2775 MB(89%)00:00:00                Iter   3    4.90%  Refine biparts2775 MB(89%)00:00:00                Iter   3    5.59%  Refine biparts2775 MB(89%)00:00:00                Iter   3    6.29%  Refine biparts2775 MB(89%)00:00:00                Iter   3    6.99%  Refine biparts2775 MB(89%)00:00:00                Iter   3    7.69%  Refine biparts2775 MB(89%)00:00:00                Iter   3    8.39%  Refine biparts2775 MB(89%)00:00:00                Iter   3    9.09%  Refine biparts2775 MB(89%)00:00:00                Iter   3    9.79%  Refine biparts2775 MB(89%)00:00:00                Iter   3   10.49%  Refine biparts2775 MB(89%)00:00:00                Iter   3   11.19%  Refine biparts2775 MB(89%)00:00:00                Iter   3   11.89%  Refine biparts2775 MB(89%)00:00:00                Iter   3   12.59%  Refine biparts2775 MB(89%)00:00:00                Iter   3   13.29%  Refine biparts2775 MB(89%)00:00:00                Iter   3   13.99%  Refine biparts2775 MB(89%)00:00:00                Iter   3   14.69%  Refine biparts2775 MB(89%)00:00:00                Iter   3   15.38%  Refine biparts2775 MB(89%)00:00:00                Iter   3   16.08%  Refine biparts2775 MB(89%)00:00:01                Iter   3   16.78%  Refine biparts2775 MB(89%)00:00:01                Iter   3   17.48%  Refine biparts2775 MB(89%)00:00:01                Iter   3   18.18%  Refine biparts2775 MB(89%)00:00:01                Iter   3   18.88%  Refine biparts2775 MB(89%)00:00:01                Iter   3   19.58%  Refine biparts2775 MB(89%)00:00:01                Iter   3   20.28%  Refine biparts2775 MB(89%)00:00:01                Iter   3   20.98%  Refine 
biparts2775 MB(89%)00:00:01                Iter   3   21.68%  Refine biparts2775 MB(89%)00:00:01                Iter   3   22.38%  Refine biparts2775 MB(89%)00:00:01                Iter   3   23.08%  Refine biparts2775 MB(89%)00:00:01                Iter   3   23.78%  Refine biparts2775 MB(89%)00:00:01                Iter   3   24.48%  Refine biparts2775 MB(89%)00:00:01                Iter   3   25.17%  Refine biparts2775 MB(89%)00:00:01                Iter   3   25.87%  Refine biparts2775 MB(89%)00:00:01                Iter   3   26.57%  Refine biparts2775 MB(89%)00:00:01                Iter   3   27.27%  Refine biparts2775 MB(89%)00:00:01                Iter   3   27.97%  Refine biparts2775 MB(89%)00:00:01                Iter   3   28.67%  Refine biparts2775 MB(89%)00:00:01                Iter   3   29.37%  Refine biparts2775 MB(89%)00:00:01                Iter   3   30.07%  Refine biparts2775 MB(89%)00:00:01                Iter   3   30.77%  Refine biparts2775 MB(89%)00:00:01                Iter   3   31.47%  Refine biparts2775 MB(89%)00:00:01                Iter   3   32.17%  Refine biparts2775 MB(89%)00:00:01                Iter   3   32.87%  Refine biparts2775 MB(89%)00:00:01                Iter   3   33.57%  Refine biparts2775 MB(89%)00:00:01                Iter   3   34.27%  Refine biparts2775 MB(89%)00:00:01                Iter   3   34.97%  Refine biparts2775 MB(89%)00:00:01                Iter   3   35.66%  Refine biparts2775 MB(89%)00:00:01                Iter   3   36.36%  Refine biparts2775 MB(89%)00:00:01                Iter   3   37.06%  Refine biparts2775 MB(89%)00:00:01                Iter   3   37.76%  Refine biparts2775 MB(89%)00:00:01                Iter   3   38.46%  Refine biparts2775 MB(89%)00:00:01                Iter   3   39.16%  Refine biparts2775 MB(89%)00:00:01                Iter   3   39.86%  Refine biparts2775 MB(89%)00:00:01                Iter   3   40.56%  Refine biparts2775 MB(89%)00:00:01                Iter   3   41.26%  
Refine biparts2775 MB(89%)00:00:01                Iter   3   41.96%  Refine biparts2775 MB(89%)00:00:01                Iter   3   42.66%  Refine biparts2775 MB(89%)00:00:01                Iter   3   43.36%  Refine biparts2775 MB(89%)00:00:01                Iter   3   44.06%  Refine biparts2775 MB(89%)00:00:01                Iter   3   44.76%  Refine biparts2775 MB(89%)00:00:01                Iter   3   45.45%  Refine biparts2775 MB(89%)00:00:01                Iter   3   46.15%  Refine biparts2775 MB(89%)00:00:01                Iter   3   46.85%  Refine biparts2775 MB(89%)00:00:01                Iter   3   47.55%  Refine biparts2775 MB(89%)00:00:01                Iter   3   48.25%  Refine biparts2775 MB(89%)00:00:01                Iter   3   48.95%  Refine biparts2775 MB(89%)00:00:01                Iter   3   49.65%  Refine biparts2775 MB(89%)00:00:01                Iter   3   50.35%  Refine biparts2775 MB(89%)00:00:01                Iter   3   51.05%  Refine biparts2775 MB(89%)00:00:01                Iter   3   51.75%  Refine biparts2775 MB(89%)00:00:01                Iter   3   52.45%  Refine biparts2775 MB(89%)00:00:01                Iter   3   53.15%  Refine biparts2775 MB(89%)00:00:01                Iter   3   53.85%  Refine biparts2775 MB(89%)00:00:01                Iter   3   54.55%  Refine biparts2775 MB(89%)00:00:01                Iter   3   55.24%  Refine biparts2775 MB(89%)00:00:01                Iter   3   55.94%  Refine biparts2775 MB(89%)00:00:01                Iter   3   56.64%  Refine biparts2775 MB(89%)00:00:01                Iter   3   57.34%  Refine biparts2775 MB(89%)00:00:01                Iter   3   58.04%  Refine biparts2775 MB(89%)00:00:01                Iter   3   58.74%  Refine biparts2775 MB(89%)00:00:01                Iter   3   59.44%  Refine biparts2775 MB(89%)00:00:01                Iter   3   60.14%  Refine biparts2775 MB(89%)00:00:01                Iter   3   60.84%  Refine biparts2775 MB(89%)00:00:01                Iter   3   61.54% 
 Refine biparts2775 MB(89%)00:00:01                Iter   3   62.24%  Refine biparts2775 MB(89%)00:00:01                Iter   3   62.94%  Refine biparts2775 MB(89%)00:00:01                Iter   3   63.64%  Refine biparts2775 MB(89%)00:00:01                Iter   3   64.34%  Refine biparts2775 MB(89%)00:00:01                Iter   3   65.03%  Refine biparts2775 MB(89%)00:00:01                Iter   3   65.73%  Refine biparts2775 MB(89%)00:00:01                Iter   3   66.43%  Refine biparts2775 MB(89%)00:00:01                Iter   3   67.13%  Refine biparts2775 MB(89%)00:00:01                Iter   3   67.83%  Refine biparts2775 MB(89%)00:00:01                Iter   3   68.53%  Refine biparts2775 MB(89%)00:00:01                Iter   3   69.23%  Refine biparts2775 MB(89%)00:00:01                Iter   3   69.93%  Refine biparts2775 MB(89%)00:00:01                Iter   3   70.63%  Refine biparts2775 MB(89%)00:00:01                Iter   3   71.33%  Refine biparts2775 MB(89%)00:00:01                Iter   3   72.03%  Refine biparts2775 MB(89%)00:00:01                Iter   3   72.73%  Refine biparts2775 MB(89%)00:00:01                Iter   3   73.43%  Refine biparts2775 MB(89%)00:00:01                Iter   3   74.13%  Refine biparts2775 MB(89%)00:00:01                Iter   3   74.83%  Refine biparts2775 MB(89%)00:00:01                Iter   3   75.52%  Refine biparts2775 MB(89%)00:00:01                Iter   3   76.22%  Refine biparts2775 MB(89%)00:00:01                Iter   3   76.92%  Refine biparts2775 MB(89%)00:00:01                Iter   3   77.62%  Refine biparts2775 MB(89%)00:00:01                Iter   3   78.32%  Refine biparts2775 MB(89%)00:00:01                Iter   3   79.02%  Refine biparts2775 MB(89%)00:00:01                Iter   3   79.72%  Refine biparts2775 MB(89%)00:00:01                Iter   3   80.42%  Refine biparts2775 MB(89%)00:00:01                Iter   3   81.12%  Refine biparts2775 MB(89%)00:00:01                Iter   3   
81.82%  Refine biparts2775 MB(89%)00:00:01                Iter   3   82.52%  Refine biparts2775 MB(89%)00:00:01                Iter   3   83.22%  Refine biparts2775 MB(89%)00:00:01                Iter   3   83.92%  Refine biparts2775 MB(89%)00:00:01                Iter   3   84.62%  Refine biparts2775 MB(89%)00:00:01                Iter   3   85.31%  Refine biparts2775 MB(89%)00:00:01                Iter   3   86.01%  Refine biparts2775 MB(89%)00:00:01                Iter   3   86.71%  Refine biparts2775 MB(89%)00:00:01                Iter   3   87.41%  Refine biparts2775 MB(89%)00:00:01                Iter   3   88.11%  Refine biparts2775 MB(89%)00:00:01                Iter   3   88.81%  Refine biparts2775 MB(89%)00:00:01                Iter   3   89.51%  Refine biparts2775 MB(89%)00:00:01                Iter   3   90.21%  Refine biparts2775 MB(89%)00:00:01                Iter   3   90.91%  Refine biparts2775 MB(89%)00:00:01                Iter   3   91.61%  Refine biparts2775 MB(89%)00:00:01                Iter   3   92.31%  Refine biparts2775 MB(89%)00:00:01                Iter   3   93.01%  Refine biparts2775 MB(89%)00:00:01                Iter   3   93.71%  Refine biparts2775 MB(89%)00:00:01                Iter   3   94.41%  Refine biparts2775 MB(89%)00:00:01                Iter   3   95.10%  Refine biparts2775 MB(89%)00:00:01                Iter   3   95.80%  Refine biparts2775 MB(89%)00:00:01                Iter   3   96.50%  Refine biparts2775 MB(89%)00:00:01                Iter   3   97.20%  Refine biparts2775 MB(89%)00:00:01                Iter   3   97.90%  Refine biparts2775 MB(89%)00:00:01                Iter   3   98.60%  Refine biparts2775 MB(89%)00:00:01                Iter   3   99.30%  Refine biparts2775 MB(89%)00:00:01                Iter   3  100.00%  Refine biparts2775 MB(89%)00:00:01                Iter   3  100.70%  Refine biparts2937 MB(95%)00:00:01                Iter   3  100.00%  Refine biparts
## 2937 MB(95%)00:00:01                Iter   4    1.40%  Refine biparts2937 MB(95%)00:00:01                Iter   4    2.10%  Refine biparts2937 MB(95%)00:00:01                Iter   4    2.80%  Refine biparts2937 MB(95%)00:00:01                Iter   4    3.50%  Refine biparts2937 MB(95%)00:00:01                Iter   4    4.20%  Refine biparts2937 MB(95%)00:00:01                Iter   4    4.90%  Refine biparts2937 MB(95%)00:00:01                Iter   4    5.59%  Refine biparts2937 MB(95%)00:00:01                Iter   4    6.29%  Refine biparts2937 MB(95%)00:00:01                Iter   4    6.99%  Refine biparts2937 MB(95%)00:00:02                Iter   4    7.69%  Refine biparts2937 MB(95%)00:00:02                Iter   4    8.39%  Refine biparts2937 MB(95%)00:00:02                Iter   4    9.09%  Refine biparts2937 MB(95%)00:00:02                Iter   4    9.79%  Refine biparts2937 MB(95%)00:00:02                Iter   4   10.49%  Refine biparts2937 MB(95%)00:00:02                Iter   4   11.19%  Refine biparts2937 MB(95%)00:00:02                Iter   4   11.89%  Refine biparts2937 MB(95%)00:00:02                Iter   4   12.59%  Refine biparts2937 MB(95%)00:00:02                Iter   4   13.29%  Refine biparts2937 MB(95%)00:00:02                Iter   4   13.99%  Refine biparts2937 MB(95%)00:00:02                Iter   4   14.69%  Refine biparts2937 MB(95%)00:00:02                Iter   4   15.38%  Refine biparts2937 MB(95%)00:00:02                Iter   4   16.08%  Refine biparts2937 MB(95%)00:00:02                Iter   4   16.78%  Refine biparts2937 MB(95%)00:00:02                Iter   4   17.48%  Refine biparts2937 MB(95%)00:00:02                Iter   4   18.18%  Refine biparts2937 MB(95%)00:00:02                Iter   4   18.88%  Refine biparts2937 MB(95%)00:00:02                Iter   4   19.58%  Refine biparts2937 MB(95%)00:00:02                Iter   4   20.28%  Refine biparts2937 MB(95%)00:00:02                Iter   4   20.98%  Refine 
biparts2937 MB(95%)00:00:02                Iter   4   21.68%  Refine biparts2937 MB(95%)00:00:02                Iter   4   22.38%  Refine biparts2937 MB(95%)00:00:02                Iter   4   23.08%  Refine biparts2937 MB(95%)00:00:02                Iter   4   23.78%  Refine biparts2937 MB(95%)00:00:02                Iter   4   24.48%  Refine biparts2937 MB(95%)00:00:02                Iter   4   25.17%  Refine biparts2937 MB(95%)00:00:02                Iter   4   25.87%  Refine biparts2937 MB(95%)00:00:02                Iter   4   26.57%  Refine biparts2937 MB(95%)00:00:02                Iter   4   27.27%  Refine biparts2937 MB(95%)00:00:02                Iter   4   27.97%  Refine biparts2937 MB(95%)00:00:02                Iter   4   28.67%  Refine biparts2937 MB(95%)00:00:02                Iter   4   29.37%  Refine biparts2937 MB(95%)00:00:02                Iter   4   30.07%  Refine biparts2937 MB(95%)00:00:02                Iter   4   30.77%  Refine biparts2937 MB(95%)00:00:02                Iter   4   31.47%  Refine biparts2937 MB(95%)00:00:02                Iter   4   32.17%  Refine biparts2937 MB(95%)00:00:02                Iter   4   32.87%  Refine biparts2937 MB(95%)00:00:02                Iter   4   33.57%  Refine biparts2937 MB(95%)00:00:02                Iter   4   34.27%  Refine biparts2937 MB(95%)00:00:02                Iter   4   34.97%  Refine biparts2937 MB(95%)00:00:02                Iter   4   35.66%  Refine biparts2937 MB(95%)00:00:02                Iter   4   36.36%  Refine biparts2937 MB(95%)00:00:02                Iter   4   37.06%  Refine biparts2937 MB(95%)00:00:02                Iter   4   37.76%  Refine biparts2937 MB(95%)00:00:02                Iter   4   38.46%  Refine biparts2937 MB(95%)00:00:02                Iter   4   39.16%  Refine biparts2937 MB(95%)00:00:02                Iter   4   39.86%  Refine biparts2937 MB(95%)00:00:02                Iter   4   40.56%  Refine biparts2937 MB(95%)00:00:02                Iter   4   41.26%  
Refine biparts2937 MB(95%)00:00:02                Iter   4   41.96%  Refine biparts2937 MB(95%)00:00:02                Iter   4   42.66%  Refine biparts2937 MB(95%)00:00:02                Iter   4   43.36%  Refine biparts2937 MB(95%)00:00:02                Iter   4   44.06%  Refine biparts2937 MB(95%)00:00:02                Iter   4   44.76%  Refine biparts2937 MB(95%)00:00:02                Iter   4   45.45%  Refine biparts2937 MB(95%)00:00:02                Iter   4   46.15%  Refine biparts2937 MB(95%)00:00:02                Iter   4   46.85%  Refine biparts2937 MB(95%)00:00:02                Iter   4   47.55%  Refine biparts2937 MB(95%)00:00:02                Iter   4   48.25%  Refine biparts2937 MB(95%)00:00:02                Iter   4   48.95%  Refine biparts2937 MB(95%)00:00:02                Iter   4   49.65%  Refine biparts2937 MB(95%)00:00:02                Iter   4   50.35%  Refine biparts2937 MB(95%)00:00:02                Iter   4   51.05%  Refine biparts2937 MB(95%)00:00:02                Iter   4   51.75%  Refine biparts2937 MB(95%)00:00:02                Iter   4   52.45%  Refine biparts2937 MB(95%)00:00:02                Iter   4   53.15%  Refine biparts2937 MB(95%)00:00:02                Iter   4   53.85%  Refine biparts2937 MB(95%)00:00:02                Iter   4   54.55%  Refine biparts2937 MB(95%)00:00:02                Iter   4   55.24%  Refine biparts2937 MB(95%)00:00:02                Iter   4   55.94%  Refine biparts2937 MB(95%)00:00:02                Iter   4   56.64%  Refine biparts2937 MB(95%)00:00:02                Iter   4   57.34%  Refine biparts2937 MB(95%)00:00:02                Iter   4   58.04%  Refine biparts2937 MB(95%)00:00:02                Iter   4   58.74%  Refine biparts2937 MB(95%)00:00:02                Iter   4   59.44%  Refine biparts2937 MB(95%)00:00:02                Iter   4   60.14%  Refine biparts2937 MB(95%)00:00:02                Iter   4   60.84%  Refine biparts2937 MB(95%)00:00:02                Iter   4   61.54% 
 Refine biparts2937 MB(95%)00:00:02                Iter   4   62.24%  Refine biparts2937 MB(95%)00:00:02                Iter   4   62.94%  Refine biparts2937 MB(95%)00:00:02                Iter   4   63.64%  Refine biparts2937 MB(95%)00:00:02                Iter   4   64.34%  Refine biparts2937 MB(95%)00:00:02                Iter   4   65.03%  Refine biparts2937 MB(95%)00:00:02                Iter   4   65.73%  Refine biparts2937 MB(95%)00:00:02                Iter   4   66.43%  Refine biparts2937 MB(95%)00:00:02                Iter   4   67.13%  Refine biparts2937 MB(95%)00:00:02                Iter   4   67.83%  Refine biparts2937 MB(95%)00:00:02                Iter   4   68.53%  Refine biparts2937 MB(95%)00:00:02                Iter   4   69.23%  Refine biparts2937 MB(95%)00:00:02                Iter   4   69.93%  Refine biparts2937 MB(95%)00:00:02                Iter   4   70.63%  Refine biparts2937 MB(95%)00:00:02                Iter   4   71.33%  Refine biparts2937 MB(95%)00:00:02                Iter   4   72.03%  Refine biparts2937 MB(95%)00:00:02                Iter   4   72.73%  Refine biparts2937 MB(95%)00:00:02                Iter   4   73.43%  Refine biparts2937 MB(95%)00:00:02                Iter   4   74.13%  Refine biparts2937 MB(95%)00:00:02                Iter   4   74.83%  Refine biparts2937 MB(95%)00:00:02                Iter   4   75.52%  Refine biparts2937 MB(95%)00:00:02                Iter   4   76.22%  Refine biparts2937 MB(95%)00:00:02                Iter   4   76.92%  Refine biparts2937 MB(95%)00:00:02                Iter   4   77.62%  Refine biparts2937 MB(95%)00:00:02                Iter   4   78.32%  Refine biparts2937 MB(95%)00:00:02                Iter   4   79.02%  Refine biparts2937 MB(95%)00:00:02                Iter   4   79.72%  Refine biparts2937 MB(95%)00:00:02                Iter   4   80.42%  Refine biparts2937 MB(95%)00:00:02                Iter   4   81.12%  Refine biparts2937 MB(95%)00:00:02                Iter   4   
81.82%  Refine biparts2937 MB(95%)00:00:02                Iter   4   82.52%  Refine biparts2937 MB(95%)00:00:02                Iter   4   83.22%  Refine biparts2937 MB(95%)00:00:02                Iter   4   83.92%  Refine biparts2937 MB(95%)00:00:02                Iter   4   84.62%  Refine biparts2937 MB(95%)00:00:02                Iter   4   85.31%  Refine biparts2937 MB(95%)00:00:02                Iter   4   86.01%  Refine biparts2937 MB(95%)00:00:02                Iter   4   86.71%  Refine biparts2937 MB(95%)00:00:02                Iter   4   87.41%  Refine biparts2937 MB(95%)00:00:02                Iter   4   88.11%  Refine biparts2937 MB(95%)00:00:02                Iter   4   88.81%  Refine biparts2937 MB(95%)00:00:02                Iter   4   89.51%  Refine biparts2937 MB(95%)00:00:02                Iter   4   90.21%  Refine biparts2937 MB(95%)00:00:02                Iter   4   90.91%  Refine biparts2937 MB(95%)00:00:02                Iter   4   91.61%  Refine biparts2937 MB(95%)00:00:02                Iter   4   92.31%  Refine biparts2937 MB(95%)00:00:02                Iter   4   93.01%  Refine biparts2937 MB(95%)00:00:02                Iter   4   93.71%  Refine biparts2937 MB(95%)00:00:02                Iter   4   94.41%  Refine biparts2937 MB(95%)00:00:02                Iter   4   95.10%  Refine biparts2937 MB(95%)00:00:02                Iter   4   95.80%  Refine biparts2937 MB(95%)00:00:02                Iter   4   96.50%  Refine biparts2937 MB(95%)00:00:02                Iter   4   97.20%  Refine biparts2937 MB(95%)00:00:02                Iter   4   97.90%  Refine biparts2937 MB(95%)00:00:02                Iter   4   98.60%  Refine biparts2937 MB(95%)00:00:02                Iter   4   99.30%  Refine biparts2937 MB(95%)00:00:02                Iter   4  100.00%  Refine biparts2937 MB(95%)00:00:02                Iter   4  100.70%  Refine biparts2937 MB(95%)00:00:02                Iter   4  100.00%  Refine biparts
## 2937 MB(95%)00:00:02                Iter   5    1.40%  Refine biparts2937 MB(95%)00:00:02                Iter   5    2.10%  Refine biparts2937 MB(95%)00:00:02                Iter   5    2.80%  Refine biparts2937 MB(95%)00:00:02                Iter   5    3.50%  Refine biparts2937 MB(95%)00:00:02                Iter   5    4.20%  Refine biparts2937 MB(95%)00:00:02                Iter   5    4.90%  Refine biparts2937 MB(95%)00:00:02                Iter   5    5.59%  Refine biparts2937 MB(95%)00:00:02                Iter   5    6.29%  Refine biparts2937 MB(95%)00:00:02                Iter   5    6.99%  Refine biparts2937 MB(95%)00:00:02                Iter   5    7.69%  Refine biparts2937 MB(95%)00:00:02                Iter   5    8.39%  Refine biparts2937 MB(95%)00:00:02                Iter   5    9.09%  Refine biparts2937 MB(95%)00:00:02                Iter   5    9.79%  Refine biparts2937 MB(95%)00:00:02                Iter   5   10.49%  Refine biparts2937 MB(95%)00:00:02                Iter   5   11.19%  Refine biparts2937 MB(95%)00:00:02                Iter   5   11.89%  Refine biparts2937 MB(95%)00:00:02                Iter   5   12.59%  Refine biparts2937 MB(95%)00:00:02                Iter   5   13.29%  Refine biparts2937 MB(95%)00:00:02                Iter   5   13.99%  Refine biparts2937 MB(95%)00:00:02                Iter   5   14.69%  Refine biparts2937 MB(95%)00:00:02                Iter   5   15.38%  Refine biparts2937 MB(95%)00:00:02                Iter   5   16.08%  Refine biparts2937 MB(95%)00:00:02                Iter   5   16.78%  Refine biparts2937 MB(95%)00:00:02                Iter   5   17.48%  Refine biparts2937 MB(95%)00:00:02                Iter   5   18.18%  Refine biparts2937 MB(95%)00:00:02                Iter   5  100.00%  Refine biparts
## 2937 MB(95%)00:00:02                Iter   5  100.00%  Refine biparts
# Convert aligned sequences to phyDat format for phylogenetic analysis
aligned_phyDat_unique <- as.phyDat(aligned_sequences_unique)

# Compute a distance matrix for phylogenetic tree construction
dist_matrix <- dist.ml(aligned_phyDat_unique, model = "JC69")

### Build a Neighbor-Joining tree from the distance matrix----
nj_tree <- nj(dist_matrix)

# Plot the NJ tree (default margins are kept so the title is not clipped)
plot(nj_tree, main = "Neighbor-Joining Phylogenetic Tree of Anolis species",
     cex = 0.4, edge.width = 0.3)
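
The distance matrix above uses the JC69 (Jukes-Cantor) model. As a rough illustration of what that correction does, the closed-form JC69 distance, d = -3/4 * ln(1 - 4p/3) with p the proportion of differing sites, can be computed by hand in base R. The helper `jc69_distance()` and the two toy sequences are hypothetical and are not part of the original analysis; `dist.ml()` estimates the distance by maximum likelihood and treats gaps in its own way, so its values need not match this sketch exactly.

```r
# Hypothetical helper: closed-form Jukes-Cantor (JC69) distance between two
# aligned DNA sequences supplied as equal-length character strings.
jc69_distance <- function(seq_a, seq_b) {
  a <- strsplit(toupper(seq_a), "")[[1]]
  b <- strsplit(toupper(seq_b), "")[[1]]
  stopifnot(length(a) == length(b))

  # Compare only sites where both sequences carry an unambiguous base
  bases <- c("A", "C", "G", "T")
  keep <- a %in% bases & b %in% bases

  p <- mean(a[keep] != b[keep])   # observed proportion of differing sites
  -0.75 * log(1 - (4 / 3) * p)    # JC69 correction for multiple substitutions
}

# Toy aligned sequences, made up for illustration (2 differences in 20 sites)
toy_a <- "ACGTACGTACGTACGTACGT"
toy_b <- "ACGTACGAACGTACCTACGT"

toy_distance <- jc69_distance(toy_a, toy_b)
toy_distance
```

The corrected distance is slightly larger than the raw proportion of differences (0.1 here), because the model allows for sites that changed more than once.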

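### Maximum Likelihood tree optimization----
# pml() is called with its default of a single rate class (k = 1), so the
# optGamma = TRUE request below is ignored, as the first log line reports
ml_tree <- pml(nj_tree, aligned_phyDat_unique)
ml_tree_optimized <- optim.pml(ml_tree, model = "GTR", optInv = TRUE, optGamma = TRUE)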
## only one rate class, ignored optGamma
## optimize edge weights:  -10039.79 --> -9918.345 
## optimize rate matrix:  -9918.345 --> -9498.801 
## optimize invariant sites:  -9498.801 --> -8469.767 
## optimize edge weights:  -8469.767 --> -8462.715 
## optimize rate matrix:  -8462.715 --> -8454.836 
## optimize invariant sites:  -8454.836 --> -8454.836 
## optimize edge weights:  -8454.836 --> -8454.689 
## optimize rate matrix:  -8454.689 --> -8454.685 
## optimize invariant sites:  -8454.685 --> -8454.623 
## optimize edge weights:  -8454.623 --> -8454.571 
## optimize rate matrix:  -8454.571 --> -8454.571 
## optimize invariant sites:  -8454.571 --> -8454.531 
## optimize edge weights:  -8454.531 --> -8454.5 
## optimize rate matrix:  -8454.5 --> -8454.5 
## optimize invariant sites:  -8454.5 --> -8454.476 
## optimize edge weights:  -8454.476 --> -8454.457 
## optimize rate matrix:  -8454.457 --> -8454.457 
## optimize invariant sites:  -8454.457 --> -8454.443 
## optimize edge weights:  -8454.443 --> -8454.432 
## optimize rate matrix:  -8454.432 --> -8454.432 
## optimize invariant sites:  -8454.432 --> -8454.424 
## optimize edge weights:  -8454.424 --> -8454.417 
## optimize rate matrix:  -8454.417 --> -8454.417 
## optimize invariant sites:  -8454.417 --> -8454.412 
## optimize edge weights:  -8454.412 --> -8454.408 
## optimize rate matrix:  -8454.408 --> -8454.408 
## optimize invariant sites:  -8454.408 --> -8454.405 
## optimize edge weights:  -8454.405 --> -8454.403 
## optimize rate matrix:  -8454.403 --> -8454.403 
## optimize invariant sites:  -8454.403 --> -8454.402 
## optimize edge weights:  -8454.402 --> -8454.4 
## optimize rate matrix:  -8454.4 --> -8454.4 
## optimize invariant sites:  -8454.4 --> -8454.399 
## optimize edge weights:  -8454.399 --> -8454.398
# Print summary of the optimized Maximum Likelihood tree
#summary(ml_tree_optimized)

# Plot the optimized tree with adjusted graphical parameters
par(mar = c(1, 1, 1, 1))  # Adjust margins
plot(ml_tree_optimized$tree, cex = 0.4)  # Scale tree label size
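
The optimization log above begins with "only one rate class, ignored optGamma", which indicates that no gamma rate heterogeneity was fitted. A minimal sketch of one way to enable it is to create the `pml` object with several rate classes (`k = 4`) before calling `optim.pml()`. The example uses the `Laurasiatherian` alignment that ships with phangorn rather than the Anolis data, and the starting value `inv = 0.2` is an arbitrary assumption; it illustrates the mechanism and does not reproduce the results reported here.

```r
library(phangorn)

# Example alignment shipped with phangorn, used here instead of the Anolis data
data(Laurasiatherian)

# Starting tree: neighbor joining on JC69 maximum likelihood distances
start_tree <- NJ(dist.ml(Laurasiatherian, model = "JC69"))

# k = 4 gives four discrete gamma rate classes, so optGamma has something to
# optimize; inv is only a starting value for the proportion of invariant sites
fit_start <- pml(start_tree, data = Laurasiatherian, k = 4, inv = 0.2)

# Optimize edge lengths, GTR rates, invariant sites and the gamma shape
fit_gtr <- optim.pml(fit_start, model = "GTR",
                     optInv = TRUE, optGamma = TRUE,
                     control = pml.control(trace = 0))

fit_gtr$shape   # estimated gamma shape parameter
fit_gtr$inv     # estimated proportion of invariant sites
```

Applied to the Anolis alignment, the same change would mean passing `k = 4` in the `pml()` call; the log-likelihood values would then differ from those shown in the log.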

With the phylogenetic trees constructed, the analysis turns to the geographic spread of Anolis species. The aim is to see where the analysis should focus and whether the distribution plotted from the GBIF occurrence data agrees with the community structure. The resulting map suggests that most species occurrences lie in and around South America.
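
One simple way to check a statement about where occurrences are concentrated is to tally records and species per country. The sketch below does this with dplyr on a small made-up table; `toy_occurrence` and its country labels are hypothetical placeholders, and only the column names `species` and `country` are taken from the occurrence data used in this report.

```r
library(dplyr)

# Toy stand-in for the GBIF occurrence table (values are made up)
toy_occurrence <- data.frame(
  species = c("Anolis carolinensis", "Anolis carolinensis", "Anolis sagrei",
              "Anolis biporcatus", "Anolis carolinensis", "Anolis sagrei"),
  country = c("Country A", "Country A", "Country A",
              "Country B", "Country A", "Country A"),
  stringsAsFactors = FALSE
)

# Number of records and of distinct species per country, largest first
country_summary <- toy_occurrence %>%
  group_by(country) %>%
  summarise(n_records = n(),
            n_species = n_distinct(species),
            .groups = "drop") %>%
  arrange(desc(n_records))

country_summary
```

Running the same summary on the real occurrence table would show directly which countries dominate the records, which is worth knowing before interpreting the map.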

# 6. GEOGRAPHIC DISTRIBUTION ANALYSIS----
# Research Questions: Species distribution across communities
# Find species common to both occurrence and sequence data
species_occurrence <- unique(occurrence_data$species)
species_sequences <- unique(dfAnolis_16S_unique$Species_Name)
common_species <- intersect(species_occurrence, species_sequences)

# Filter occurrence and sequence data to only include common species
occurrence_common <- occurrence_data %>% 
  filter(species %in% common_species)

sequence_common <- dfAnolis_16S_unique %>% 
  filter(Species_Name %in% common_species)

# Check results for common species
print(common_species)
##  [1] "Anolis carolinensis" "Anolis sagrei"       "Anolis biporcatus"  
##  [4] "Anolis osa"          "Anolis limifrons"    "Anolis auratus"     
##  [7] "Anolis porcatus"     "Anolis polylepis"    "Anolis distichus"   
## [10] "Anolis grahami"      "Anolis bimaculatus"  "Anolis occultus"    
## [13] "Anolis oxylophus"    "Anolis anisolepis"   "Anolis pachypus"    
## [16] "Anolis humilis"      "Anolis gaigei"       "Anolis rodriguezii" 
## [19] "Anolis capito"
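
Because `intersect()` silently drops every species that appears in only one of the two sources, it can help to list what was excluded. The following sketch uses base R `setdiff()` on two made-up name vectors; `toy_occ_species` and `toy_seq_species` are hypothetical and stand in for `species_occurrence` and `species_sequences`.

```r
# Toy species lists (made up for illustration)
toy_occ_species <- c("Anolis sagrei", "Anolis carolinensis", "Anolis cristatellus")
toy_seq_species <- c("Anolis sagrei", "Anolis carolinensis", "Anolis gaigei")

# Species kept by the intersection
toy_common <- intersect(toy_occ_species, toy_seq_species)

# Species lost because they occur in only one of the two sources
only_in_occurrence <- setdiff(toy_occ_species, toy_seq_species)
only_in_sequences  <- setdiff(toy_seq_species, toy_occ_species)

cat("Shared:", length(toy_common), "\n")
cat("Occurrence only:", only_in_occurrence, "\n")
cat("Sequence only:", only_in_sequences, "\n")
```

Names that differ only in spelling or trailing whitespace would also end up in these two lists, so they are a convenient place to spot mismatches between the sources.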
head(occurrence_common)
## # A tibble: 6 × 6
##   species             decimalLatitude decimalLongitude country     stateProvince
##   <chr>                         <dbl>            <dbl> <chr>       <chr>        
## 1 Anolis carolinensis            28.1            -82.6 United Sta… Florida      
## 2 Anolis carolinensis            32.6            -80.1 United Sta… South Caroli…
## 3 Anolis sagrei                  26.1            -80.1 United Sta… Florida      
## 4 Anolis biporcatus              16.8            -88.4 Belize      Stann Creek  
## 5 Anolis carolinensis            25.7            -80.3 United Sta… Florida      
## 6 Anolis sagrei                  27.8            -82.6 United Sta… Florida      
## # ℹ 1 more variable: iucnRedListCategory <chr>
head(sequence_common)
##                                                                                       Anolis16S_Title
## 1    MH140619.1 Anolis gaigei voucher CH 5426 16S ribosomal RNA gene, partial sequence; mitochondrial
## 2 MH140610.1 Anolis polylepis voucher CH 5687 16S ribosomal RNA gene, partial sequence; mitochondrial
## 3 MH140596.1 Anolis oxylophus voucher CH 6195 16S ribosomal RNA gene, partial sequence; mitochondrial
## 4 MH140573.1 Anolis limifrons voucher CH 5652 16S ribosomal RNA gene, partial sequence; mitochondrial
## 5   MH140561.1 Anolis humilis voucher CH 6552 16S ribosomal RNA gene, partial sequence; mitochondrial
## 6    MH140508.1 Anolis capito voucher CH 5926 16S ribosomal RNA gene, partial sequence; mitochondrial
##       Species_Name
## 1    Anolis gaigei
## 2 Anolis polylepis
## 3 Anolis oxylophus
## 4 Anolis limifrons
## 5   Anolis humilis
## 6    Anolis capito
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              Anolis_16S_Sequence
## 1 AGCCTTTAGCAAAACAAGTATTAAAGGTAACGCCTGCCCAGTGAAATTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATGAAGACCTGTATGAATGGCTATATGAGTATTTAACTGTCTCCTTTAACTAATCAGTGAAACTGATCTTTCAGTACAAAAGCTGAAATATCATCATAAGACGAGAAGACCCTGTGGAGCTTTAAATTTTTAACAAAGTATCACTAAACAGACGCTTATGATAAAAAATCTTTAGTTGGGGCGACTTTGGAGCAAAACTTAACCTCCAAGATAAAAGTACCACCTAATTTCAGGCTCACAAGCCGAACCTTATAGACCCAGTATTAAATACTGATCAACGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCAAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGAAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAACAGTCCT
## 2      AGCCTTTAGCAAAACAAGTATTAAAGGTAGCGCCTGCCCAGTGAAACTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATAAAGACCCGTATGAATGGCTAAATGAATATTTAACTGTCTCCTTTAACTAATCAGTGAAACTGATCTTTCAGTACAAAAGCTGAAATAAAAACATAAGACGAGAAGACCCTGTGGAGCTTCAAATTTTAAACAAAATATAATAAAAATGTATTTATGTTAAAAATTTTTAGTTGGGGCGACTTTGGAGCAAAACTAAACCTCCAAGAAAAGGCACAGCCTAACTGAGACCAACAGGCCAAACCATAAAGACCCAGTATATTTACTGACTAATGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCAAGAGTTCCTATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCTAATGGTGCAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAATAGTCCT
## 3    AGCCTTTAGCAAAACAAGTATTAAAGGTGATGCCTGCCCAGTGAAATTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATGAAGACCTGTATGAATGGCTATATGAGTATTTAACTGTCTCCTTTAACCAATCAGTGAAACTGATCTCTCAGTACAAAAGCTGAGATAAACACATAAGACGAGAAGACCCCGTGGAGCTTTAAATTTTTAACAACACACTACTTAAATATGTTTATGATAAAAAATTTTTAGTTGGGGCGACTTTGGAGAAAAACTAATCCTCCAAGAAAAGGTACCGCCTACTTTAGGCCTACAAGCCAAACTACATAGACCCAGTATTAGATACTGATCAACGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCAAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGTAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAATAGTCCT
## 4 AGCCTTTAGCAAAACTAGTATTAAAGGTAACGCCTGCCCAGTGAAATTTTAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATGAAGACCTGTATGAATGGCTAAATGAGTATTTTACTGTCTCCTTTAACTAATCAGTGAAACTGATCTTTCAGTACAAAAGCTGAAATATCTTCATAAGACGAGAAGACCCCGTGGAGCTTTAAACTTTTAACAGTATATCTCTAAAAGAGTATTTATGATAAAAAATTTTTAGTTGGGGCGACTTTGGAGTAAAACTCAACCTCCAAGAAAAGGCAATGCCTGGCCTTAAGGCTCACAAGCCAAACCATATAGACCCAGTATATCTTACTGATTAACGAACCAAGTTACCCCGGGGATAACAGCGCCATCTTCTTCAAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGCAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAATAGTCCT
## 5        AGCCTTTAGCAAAACAAATATTAAAGGTAACGCCTGCCCAGTGAAACTTAAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATAAAGACCCGTATGAATGGCTAAATGAGTATTTAACTGTCTCCTTTAACTAATCAGTGAAACTGATCTCTCAGTACAAAAGCTGAGATAAGTTCATAAGACGAGAAGACCCTGTGGAGCTTTAAATTATTAACAAAACATAATAAAACATGTTTATGATAAAAAATTTTTAGTTGGGGCGACTTTGGAGTAAAACTAAGCCTCCAAGAAAAGGCATTGCCTAACCAAGGCTAACAGGCCAAACTAACCGACCCAGTATATTTACTGACCAACGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCAAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCTAATGGTGCAGCCGCTATTAAAGGTTCGTTTGTTCAACGATTAATGTCCT
## 6                             AGCCTTTAGCAAAACAAGTATTAAAGGTGACGCCTGCCCAGTGAAATTTTAACGGCCGCGGTATCCTAACCGTGCAAAGGTAGCGTAATCACTTGTCTTATAAATAAAGACCCGTATGAATGGCTAAATGAATATTTAACTGTCTCCTTTAACTAATCAGTGAAACTGATCCCTCAGTACAAAAGCTGAAATATAAACATAAGACGAGAAGACCCCGTGGAGCTTTAAACTTTTAACAAAATAGAACAAAACAGGTATTTATGATAAAAAGTTTTTAGTTGGGGCGACTTTGGAGTAAAACTGAACCTCCAAGAAAAGGCACCGCCTATTCCAGGCCAACAGGCCAAACTATAAAGACCCAGCACATAAACGCTGATCAATGAACCAAGTTACCCCAGGGATAACAGCGCCATCTTCTTCAAGAGTTCATATCGACAAGAAGGTTTACGACCTCGATGTTGGATCAGGACACCCAAATGGTGCAGCCGCTATTAAAGG
### Visualize species occurrence on a map using Plotly----
unique_countries <- unique(occurrence_common$country)
colors <- rainbow(length(unique_countries))
country_colors <- setNames(colors, unique_countries)
occurrence_common$color <- country_colors[occurrence_common$country]

# Create the interactive map
plot_ly(occurrence_common, 
        type = 'scattermapbox',  
        lat = ~decimalLatitude,  
        lon = ~decimalLongitude,  
        mode = 'markers',  
        marker = list(color = ~color, size = 8, opacity = 0.7),
        text = ~paste0("Species: ", species, "<br>",  
                       "Country: ", country, "<br>",  
                       "State/Province: ", stateProvince, "<br>",  
                       "Latitude: ", decimalLatitude, "<br>",  
                       "Longitude: ", decimalLongitude),  
        hoverinfo = 'text') %>%  
  layout(mapbox = list(style = 'open-street-map', 
                       zoom = 3, 
                       center = list(lat = 0, lon = -90), 
                       bearing = 0, 
                       pitch = 0),  
         margin = list(l = 0, r = 0, t = 0, b = 0))  # Remove margins

With the data prepared, the analysis turns to its main subject: the phylogenetic community structure of the genus Anolis. The following code chunk combines species occurrence data with the phylogeny. It builds a community matrix of occurrence counts per species per country and prunes the phylogenetic tree to the species in that matrix. It then calculates the mean pairwise phylogenetic distance (MPD) to measure relatedness within each community, plots MPD by country, and calculates Faith's Phylogenetic Diversity (PD), the total branch length represented in each community. Each species is assigned to the country where it was recorded most often, and the tree is plotted with these annotations. Finally, a null model (999 randomizations of the tip labels) tests whether the observed MPD differs from random expectation, which classifies each community as phylogenetically clustered, overdispersed, or indistinguishable from random.
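To make the two core quantities concrete, here is a minimal base-R sketch of how MPD and its standardized effect size can be computed for one community under a tip-label-shuffling null model. It is not the picante implementation. The toy distance matrix, the species names `sp_a` to `sp_d`, and the helpers `mpd_toy` and `ses_mpd_toy` are assumptions made for illustration.

```r
# Toy tip-to-tip distance matrix for four hypothetical species.
species <- c("sp_a", "sp_b", "sp_c", "sp_d")
dist_mat <- matrix(
  c(0, 2, 6, 6,
    2, 0, 6, 6,
    6, 6, 0, 4,
    6, 6, 4, 0),
  nrow = 4, byrow = TRUE, dimnames = list(species, species)
)

# MPD: mean of all pairwise distances among the species present in a community.
mpd_toy <- function(present, d) {
  if (length(present) < 2) return(NA_real_)  # Undefined for fewer than two species
  sub <- d[present, present]
  mean(sub[lower.tri(sub)])
}

# SES: compare the observed MPD with MPDs obtained after shuffling the tip labels.
ses_mpd_toy <- function(present, d, runs = 999) {
  obs <- mpd_toy(present, d)
  null <- replicate(runs, {
    shuffled <- d
    new_labels <- sample(rownames(d))
    dimnames(shuffled) <- list(new_labels, new_labels)
    mpd_toy(present, shuffled)
  })
  c(mpd_obs = obs, ses = (obs - mean(null)) / sd(null))
}

set.seed(1)
mpd_toy(c("sp_a", "sp_b"), dist_mat)      # Two close relatives: MPD = 2
mpd_toy("sp_a", dist_mat)                 # Single species: NA
ses_mpd_toy(c("sp_a", "sp_b"), dist_mat)  # A negative SES points toward clustering
```

A community made of the two closest relatives gets an MPD below most shuffled communities, so its SES is negative. A single-species community returns `NA`, which is why such communities drop out of any MPD-based test.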

# 7. COMMUNITY STRUCTURE ANALYSIS----
# Research question: How are Anolis species distributed across communities?
# Uses occurrence_common (species present in both the occurrence and sequence data)
# together with the optimized phylogenetic tree (ml_tree_optimized)

### Step 1: Create the community matrix----
community_matrix <- occurrence_common %>%
  group_by(country, species) %>%
  summarise(n = n(), .groups = "drop") %>%  # Summarize the number of occurrences per species per country
  pivot_wider(names_from = species, values_from = n, values_fill = 0) %>% # Pivot data into a matrix format
  column_to_rownames(var = "country")  # Set country names as row names

# Check the resulting community matrix
head(community_matrix)  # View the first few rows of the matrix
##                                  Anolis biporcatus Anolis sagrei
## Belize                                           1             3
## Bonaire, Sint Eustatius and Saba                 0             0
## Brazil                                           0             0
## Cayman Islands                                   0             1
## Chinese Taipei                                   0             1
## Colombia                                         0             0
##                                  Anolis bimaculatus Anolis porcatus
## Belize                                            0               0
## Bonaire, Sint Eustatius and Saba                  1               0
## Brazil                                            0               1
## Cayman Islands                                    0               0
## Chinese Taipei                                    0               0
## Colombia                                          0               0
##                                  Anolis auratus Anolis gaigei Anolis capito
## Belize                                        0             0             0
## Bonaire, Sint Eustatius and Saba              0             0             0
## Brazil                                        0             0             0
## Cayman Islands                                0             0             0
## Chinese Taipei                                0             0             0
## Colombia                                      4             1             0
##                                  Anolis humilis Anolis limifrons Anolis osa
## Belize                                        0                0          0
## Bonaire, Sint Eustatius and Saba              0                0          0
## Brazil                                        0                0          0
## Cayman Islands                                0                0          0
## Chinese Taipei                                0                0          0
## Colombia                                      0                0          0
##                                  Anolis oxylophus Anolis pachypus
## Belize                                          0               0
## Bonaire, Sint Eustatius and Saba                0               0
## Brazil                                          0               0
## Cayman Islands                                  0               0
## Chinese Taipei                                  0               0
## Colombia                                        0               0
##                                  Anolis polylepis Anolis distichus
## Belize                                          0                0
## Bonaire, Sint Eustatius and Saba                0                0
## Brazil                                          0                0
## Cayman Islands                                  0                0
## Chinese Taipei                                  0                0
## Colombia                                        0                0
##                                  Anolis grahami Anolis anisolepis
## Belize                                        0                 0
## Bonaire, Sint Eustatius and Saba              0                 0
## Brazil                                        0                 0
## Cayman Islands                                0                 0
## Chinese Taipei                                0                 0
## Colombia                                      0                 0
##                                  Anolis rodriguezii Anolis occultus
## Belize                                            0               0
## Bonaire, Sint Eustatius and Saba                  0               0
## Brazil                                            0               0
## Cayman Islands                                    0               0
## Chinese Taipei                                    0               0
## Colombia                                          0               0
##                                  Anolis carolinensis
## Belize                                             0
## Bonaire, Sint Eustatius and Saba                   0
## Brazil                                             0
## Cayman Islands                                     0
## Chinese Taipei                                     0
## Colombia                                           0
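Before a community matrix is passed to the phylogenetic functions, it can help to check how many species each community contains and whether every column has a matching tip label. The sketch below uses a small made-up matrix and a made-up vector of tip labels (`toy_matrix`, `toy_tips`, and the `site_*` and `sp_*` names are all assumptions), so it runs without the project data.

```r
# Hypothetical community matrix: rows are communities, columns are species counts.
toy_matrix <- data.frame(
  sp_a = c(3, 1, 0),
  sp_b = c(1, 0, 0),
  sp_c = c(0, 0, 4),
  row.names = c("site_1", "site_2", "site_3")
)

# Hypothetical tip labels of a tree that lacks sp_c.
toy_tips <- c("sp_a", "sp_b", "sp_d")

# Species richness per community from presence/absence.
richness <- rowSums(toy_matrix > 0)
richness

# Communities with fewer than two species have no pairwise distances to average.
single_species <- names(richness)[richness < 2]
single_species

# Matrix columns without a matching tip would make tree pruning fail.
missing_from_tree <- setdiff(colnames(toy_matrix), toy_tips)
missing_from_tree
```

In this toy case `site_2` and `site_3` would yield `NA` for MPD, and `sp_c` would have to be dropped from the matrix or added to the tree before pruning.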
### Step 2: Prepare data for phylogenetic community structure----
# Combine occurrence data with phylogenetic information, ensuring tree tip labels match community matrix species names
pruned_tree <- keep.tip(ml_tree_optimized$tree, colnames(community_matrix))  # Prune tree to match species in the community matrix

### Step 3: Calculate phylogenetic community structure metrics----
# ses.mpd() returns the observed MPD plus null-model statistics (defaults: "taxa.labels" null model, 999 runs)
phylo_community_structure <- ses.mpd(community_matrix, cophenetic(pruned_tree))

# MPD is NA for communities with fewer than two species; drop those rows before plotting
phylo_community_structure_clean <- phylo_community_structure %>%
  filter(!is.na(mpd.obs) & mpd.obs >= 0)  # Keep only communities with a defined, non-negative MPD

### Step 4: Visualize the phylogenetic community structure with the clean data----
ggplot(phylo_community_structure_clean, aes(x = reorder(row.names(phylo_community_structure_clean), mpd.obs), y = mpd.obs)) +
  geom_bar(stat = "identity", fill = "skyblue") +  # Create bar plot for mean pairwise distance
  labs(title = "Mean Pairwise Phylogenetic Distance by Country",
       x = "Country", y = "MPD Observed") +  # Label axes
  theme_minimal() +  # Apply minimal theme
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +  # Rotate x-axis labels for readability
  geom_text(aes(label = round(mpd.obs, 2)), vjust = -0.3, size = 3)  # Display MPD values on top of bars

### Step 5: Midpoint rooting of the phylogenetic tree----
rooted_tree <- midpoint(pruned_tree)  # Root tree at its midpoint

# Verify the rooted tree (ensure it's properly rooted)
is.rooted(rooted_tree)
## [1] TRUE
### Step 6: Calculate Phylogenetic Diversity (PD)----
pd_results <- pd(community_matrix, rooted_tree)  # Calculate phylogenetic diversity using the community matrix

# Print results of phylogenetic diversity
print(pd_results)
##                                         PD SR
## Belize                           0.2477773  2
## Bonaire, Sint Eustatius and Saba 0.1069927  1
## Brazil                           0.1103707  1
## Cayman Islands                   0.1182753  1
## Chinese Taipei                   0.1182753  1
## Colombia                         0.2796840  2
## Costa Rica                       0.6288247  8
## Cuba                             0.2286460  2
## Dominican Republic               0.1612616  1
## Honduras                         0.1182753  1
## Jamaica                          0.1734162  2
## Mexico                           0.3235815  3
## Panama                           0.3747280  4
## Puerto Rico                      0.1417440  1
## Singapore                        0.1182753  1
## Suriname                         0.1470485  1
## United States of America         0.3690062  3
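PD sums branch lengths, so it tends to rise with the number of species in a community. The sketch below re-enters the PD and SR values printed above and checks how strongly the two are associated. The data frame name `pd_table` and the derived column `PD_per_species` are assumptions added for illustration and are not part of the original pipeline.

```r
# PD and species richness (SR) copied from the printed pd() output.
pd_table <- data.frame(
  country = c("Belize", "Bonaire, Sint Eustatius and Saba", "Brazil",
              "Cayman Islands", "Chinese Taipei", "Colombia", "Costa Rica",
              "Cuba", "Dominican Republic", "Honduras", "Jamaica", "Mexico",
              "Panama", "Puerto Rico", "Singapore", "Suriname",
              "United States of America"),
  PD = c(0.2477773, 0.1069927, 0.1103707, 0.1182753, 0.1182753, 0.2796840,
         0.6288247, 0.2286460, 0.1612616, 0.1182753, 0.1734162, 0.3235815,
         0.3747280, 0.1417440, 0.1182753, 0.1470485, 0.3690062),
  SR = c(2, 1, 1, 1, 1, 2, 8, 2, 1, 1, 2, 3, 4, 1, 1, 1, 3)
)

# Rank correlation between PD and species richness.
cor(pd_table$PD, pd_table$SR, method = "spearman")

# PD per species gives a rough richness-adjusted view.
pd_table$PD_per_species <- pd_table$PD / pd_table$SR

# Communities ordered from highest to lowest PD.
pd_table[order(-pd_table$PD), c("country", "PD", "SR", "PD_per_species")]

# Only communities with at least two species can enter an MPD-based test.
sum(pd_table$SR >= 2)
```

Because PD and SR move together, a high PD on its own mostly says that a community has many sampled species. Richness-independent measures such as the standardized effect size of MPD are better suited for comparing structure between communities.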
### Step 7: Visualize Phylogenetic Diversity by Country----
# Prepare data for plotting
pd_plot_data <- data.frame(
  Country = rownames(pd_results),
  PD = pd_results$PD,  # Phylogenetic diversity
  SR = pd_results$SR   # Species richness
)

# Create a bar plot to visualize phylogenetic diversity
ggplot(pd_plot_data, aes(x = Country, y = PD)) +
  geom_bar(stat = "identity", fill = "forestgreen") +  # Bar plot for PD values
  labs(title = "Phylogenetic Diversity by Country",
       x = "Country", 
       y = "Phylogenetic Diversity") +
  theme_minimal() +  # Minimal theme for readability
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +  # Rotate x-axis labels
  geom_text(aes(label = round(PD, 2)), vjust = -0.3, size = 3)  # Display PD values on top of bars

### Step 8: Convert tree to treedata object for annotation----
tree_data <- as_tibble(rooted_tree)

# Create community annotation data: assign each species to its most frequent community
community_annotations <- data.frame(
  species = colnames(community_matrix),
  community = apply(community_matrix, 2, function(x) names(which.max(x)))  # Assign each species to the community with the highest frequency
)

# Create a color palette for the communities
n_communities <- length(unique(community_annotations$community))  # Get number of unique communities
base_palette <- brewer.pal(n = 12, "Set3")  # Full Set3 palette (12 colors)
community_colors <- setNames(
  if (n_communities <= 12) base_palette[seq_len(n_communities)]  # One Set3 color per community
  else colorRampPalette(base_palette)(n_communities),  # Interpolate if there are more than 12 communities
  unique(community_annotations$community)  # Map each community to a color
)

### Step 9: Plot the enhanced phylogenetic tree with community distribution----
p <- ggtree(rooted_tree) %<+% community_annotations +
  geom_tippoint(aes(color = community), size = 3) +  # Plot tips with community colors
  scale_color_manual(values = community_colors) +  # Apply custom colors
  theme_tree2() +  # Enhance tree visualization
  ggplot2::labs(title = "Phylogenetic Tree with Community Distribution",
                color = "Community")  # Label the plot

# Add a scale bar to the tree plot
p <- p + geom_treescale()
plot(p)  # Display the plot

### Step 10: Compute ses.mpd with null model and runs for significance----
ses_mpd_results <- ses.mpd(community_matrix, 
                           cophenetic(rooted_tree),
                           null.model = "taxa.labels",  # Null model based on species labels
                           abundance.weighted = FALSE,  # Do not weight by abundance
                           runs = 999)  # Number of randomizations

# Clean up and filter results (remove missing values)
df_clean <- na.omit(ses_mpd_results)

# Create a data frame for analysis with observed MPD, Z-score, and p-value
mpd_clean <- data.frame(
  Community = rownames(df_clean),
  MPD_Observed = df_clean$mpd.obs,  # Observed MPD
  MPD_Z_Score = df_clean$mpd.obs.z,  # Z-score for observed MPD
  MPD_P_Value = df_clean$mpd.obs.p  # p-value for significance
)

# Interpretation based on Z-scores and p-values: phylogenetic clustering, overdispersion, or random structure
# picante reports mpd.obs.p as a quantile (rank of the observed MPD among the null values / (runs + 1)),
# so values near 0 indicate clustering and values near 1 indicate overdispersion
mpd_clean$Interpretation <- case_when(
  mpd_clean$MPD_P_Value < 0.05 & mpd_clean$MPD_Z_Score < -1.96 ~ "Phylogenetic Clustering",  # Clustered if p < 0.05 and Z < -1.96
  mpd_clean$MPD_P_Value > 0.95 & mpd_clean$MPD_Z_Score > 1.96 ~ "Phylogenetic Overdispersion",  # Overdispersed if p > 0.95 and Z > 1.96
  TRUE ~ "Random Phylogenetic Structure"  # Random if neither criterion is met
)

### Step 11: Plotting the Z-scores for interpretation of phylogenetic community structure----
ggplot(mpd_clean, aes(x = reorder(Community, MPD_Z_Score), y = MPD_Z_Score, fill = Interpretation)) +
  geom_bar(stat = "identity") +  # Bar plot for Z-scores
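  scale_fill_manual(values = c("Phylogenetic Clustering" = "blue",
                               "Phylogenetic Overdispersion" = "gray",
                               "Random Phylogenetic Structure" = "gray")) +  # Named values keep each color tied to its category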
  labs(title = "Phylogenetic Community Structure",
       x = "Community", 
       y = "MPD Standardized Effect Size") +  # Axis labels
  theme_minimal() +  # Minimal theme for clarity
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +  # Rotate x-axis labels
  geom_hline(yintercept = 0, linetype = "dashed") +  # Add a dashed line at y=0
  geom_hline(yintercept = c(-1.96, 1.96), linetype = "dotted", color = "red")  # Add red lines for Z-scores ±1.96

### Step 12: Additional insights: Calculate summary statistics grouped by interpretation----
summary_stats <- mpd_clean %>%
  group_by(Interpretation) %>%
  summarise(
    Count = n(),  # Number of communities per interpretation
    Mean_Z_Score = mean(MPD_Z_Score),  # Mean Z-score per interpretation
    Mean_P_Value = mean(MPD_P_Value)  # Mean p-value per interpretation
  )

# Print summary statistics
cat("Summary of Phylogenetic Community Structure:\n")
## Summary of Phylogenetic Community Structure:
print(summary_stats)  # Display the summary statistics for interpretation
## # A tibble: 2 × 4
##   Interpretation                Count Mean_Z_Score Mean_P_Value
##   <chr>                         <int>        <dbl>        <dbl>
## 1 Phylogenetic Clustering           1       -2.91         0.005
## 2 Random Phylogenetic Structure     7       -0.158        0.414

The null-model test covers the eight countries with at least two sampled species; MPD is undefined for the nine single-species communities, so they are excluded. Among these eight, Costa Rica shows significant phylogenetic clustering (z-score -2.91, p = 0.005) and the highest phylogenetic diversity (PD = 0.63, from 8 species), while the other seven communities do not differ from random expectation (mean z-score -0.16, mean p = 0.41). Costa Rica’s position on the Central American land bridge may contribute to this signature, although its high PD also reflects that it has the most sampled species. Most island communities in the dataset contain a single sampled species, so their phylogenetic structure cannot be assessed with MPD; the two islands that could be tested, Cuba and Jamaica, fall in the random group. These conclusions draw on three complementary views of the data: the standardized effect size of MPD, PD, and the occurrence map.
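The p-value that picante's `ses.mpd` reports (`mpd.obs.p`) is a quantile: the rank of the observed MPD among the null values, divided by the number of runs plus one. The following base-R sketch reproduces that calculation on simulated null values to show why small p-values go with clustering and p-values near 1 go with overdispersion. The null distribution, the observed values, and the helpers `quantile_p` and `ses` are assumptions for illustration only.

```r
set.seed(42)
runs <- 999

# Hypothetical null distribution of MPD values (not derived from the Anolis data).
null_mpd <- rnorm(runs, mean = 5, sd = 1)

# Quantile p-value: rank of the observed value among observed + null values.
quantile_p <- function(obs, null) {
  rank(c(obs, null))[1] / (length(null) + 1)
}

# Standardized effect size (z-score) of the observed value.
ses <- function(obs, null) {
  (obs - mean(null)) / sd(null)
}

low_mpd  <- 0   # Far below the null values: close relatives co-occur
high_mpd <- 10  # Far above the null values: distant relatives co-occur

c(z = ses(low_mpd, null_mpd),  p = quantile_p(low_mpd, null_mpd))
c(z = ses(high_mpd, null_mpd), p = quantile_p(high_mpd, null_mpd))
```

An observed MPD far below the null values gives a strongly negative z-score and a p-value near 0, whereas one far above gives a strongly positive z-score and a p-value near 1. A rule that requires a small p-value together with a positive z-score therefore cannot detect overdispersion.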

  1. Visualizations The code generates several visualizations of the phylogenetic community structure of Anolis species: an interactive map of occurrences, a phylogenetic tree annotated with community assignments, and bar plots of Phylogenetic Diversity (PD) and Mean Pairwise Phylogenetic Distance (MPD) by country. A standardized effect size (SES) plot shows each community's MPD z-score, with clustered communities in blue and all others in gray, a dashed reference line at 0, and dotted red lines at the ±1.96 thresholds. The map is built with plotly and the remaining plots with ggplot2 and ggtree, using color-coding, rotated axis labels, and value labels on the bars for readability.

  2. Results and Discussion Key Findings: The phylogenetic analysis of Anolis species across the Americas revealed notable patterns in community structure and evolutionary relationships. Costa Rica emerged as a key region, showing significant phylogenetic clustering (z-score = -2.91) and the highest phylogenetic diversity (0.63). The clustering indicates that species in Costa Rica are more closely related than expected by chance, which is consistent with environmental filtering, where specific environmental conditions favor certain evolutionary lineages. In contrast, the seven other communities with at least two species exhibited random phylogenetic structure (mean z-score = -0.16), reflecting more random assembly of species. Continental regions such as the USA, Panama, and Colombia showed phylogenetic diversity of 0.28-0.37, while most island communities were represented by a single sampled species with low diversity (0.11-0.16); Cuba and Jamaica, with two species each, reached 0.23 and 0.17. MPD cannot be computed for single-species communities, so their phylogenetic structure remains untested. Their low diversity likely reflects limited sampling as well as limited colonization opportunities, smaller area, and geographic isolation. This suggests that evolutionary dynamics may differ between continental and island habitats.
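The island versus continental contrast can be summarized directly from the species richness values printed by `pd()`. The sketch below is a rough tabulation, not part of the original analysis: the `island` flag is a classification assumed here for illustration, and the names `sr_obs` and `summary_by_group` are hypothetical.

```r
# Species richness (SR) copied from the printed pd() output.
sr_obs <- data.frame(
  country = c("Belize", "Bonaire, Sint Eustatius and Saba", "Brazil",
              "Cayman Islands", "Chinese Taipei", "Colombia", "Costa Rica",
              "Cuba", "Dominican Republic", "Honduras", "Jamaica", "Mexico",
              "Panama", "Puerto Rico", "Singapore", "Suriname",
              "United States of America"),
  SR = c(2, 1, 1, 1, 1, 2, 8, 2, 1, 1, 2, 3, 4, 1, 1, 1, 3),
  # Assumed island/mainland classification, for illustration only.
  island = c(FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE,
             FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE)
)

group <- ifelse(sr_obs$island, "island", "mainland")

# Per group: number of communities, number with enough species for MPD, mean SR.
summary_by_group <- data.frame(
  n_communities = as.vector(table(group)),
  n_testable    = as.vector(tapply(sr_obs$SR >= 2, group, sum)),
  mean_SR       = as.vector(tapply(sr_obs$SR, group, mean)),
  row.names     = names(table(group))
)
summary_by_group
```

Under this classification only two of the eight island communities have the two or more species needed for an MPD test, compared with six of the nine mainland communities. Any island versus mainland comparison of phylogenetic structure therefore rests on very few data points.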

Geographic and Evolutionary Implications: The spatial patterns observed in the geographic distribution map corresponded with the phylogenetic data. Pairwise distances increased from south to north, suggesting a latitudinal effect on evolutionary relationships that was consistent with the phylogenetic tree’s branching patterns. Costa Rica’s geographic position as a continental bridge may help explain its high phylogenetic diversity and clustering, as species there may have experienced less geographic isolation. Conversely, the low diversity of island communities aligns with their geographic isolation and with limited colonization and founder effects. The north-south gradient in pairwise distances may further reflect different evolutionary pressures acting on species across the continent. Together, these results point to an interplay between geography, phylogeny, and environmental factors in shaping biodiversity and community structure. The study is limited by the availability of genetic sequence data and by potential sampling biases. Future studies could expand genetic sampling and incorporate more ecological variables to refine these conclusions.

  1. Reflection This project has significantly enhanced my understanding of bioinformatics, phylogenetic analysis, and computational biology. By working with complex biological datasets and applying advanced R programming skills, I developed practical experience that will be invaluable for future research. Implementing phylogenetic analyses and community ecology methods has broadened my knowledge and laid a solid foundation for tackling similar challenges in future projects. This experience will be instrumental as I continue to explore and contribute to the fields of bioinformatics and evolutionary biology. It will also help me manage and handle the data sets I will work with in the BINF*6999 Summer Project.

  2. Acknowledgment I would like to express my sincere gratitude to Dr. Karl and Brittany for their invaluable help and guidance throughout the course.

  3. References Phylogenetic comparative methods in ecology and evolution (Revell, 2014) Community phylogenetics: concepts and approaches (Webb et al., 2002) Bioinformatics sequence analysis (Durbin et al., 1998)